DOCUMENT RESUME

ED 062 397                                          TM 001 336

AUTHOR       Baker, Frank B.; Hoyt, Cyril J.
TITLE        The Relation of the Method of Reciprocal Averages to
             Guttman's Internal Consistency Scaling Model.
PUB DATE     Apr 72
NOTE         19p.; Paper presented at the annual meeting of the
             American Educational Research Association (Chicago,
             Illinois, April 1972)

EDRS PRICE   MF-$0.65 HC-$3.29
DESCRIPTORS  Analysis of Variance; Correlation; *Evaluation
             Methods; *Internal Scaling; *Item Analysis;
             Mathematical Applications; *Mathematical Models;
             Psychometrics; Scores; Statistical Analysis
IDENTIFIERS  *Guttmans Internal Consistency Scaling Model



ABSTRACT 

A scaling technique known as the Method of Reciprocal 
Averages has been in use since the early 1930’s. This technique 
yields a set of item response weights for a psychological inventory 
which maximizes the internal consistency of the inventory for a group 
of subjects. Although the technique has been used for many years, its 
mathematical foundations have not been made explicit. In the present 
paper, it is shown that the informal data processing procedures of 
this technique actually solve the set of linear equations yielded by 
Guttman’s Least Squares Model for internal consistency scaling. The 
Method of Reciprocal Averages can be implemented as a simple 
extension to existing item analysis computer programs. (Author/CK) 

Abstract 

A scaling technique known as the Method of Reciprocal Averages 
has been in use since the early 1930’s. This technique yields a set 
of item response weights for a psychological inventory which maximizes 
the internal consistency of the inventory for a group of subjects. 
Although the technique has been used for many years, its mathematical 
foundations have not been made explicit. In the present paper it is 
shown that the informal data processing procedures of this technique 
actually solve the set of linear equations yielded by Guttman’s Least 
Squares Model for internal consistency scaling. The constraint 
imposed by Guttman to insure that the solution yields a nonextraneous 
set of weights is also met. From a computational point of view the 
Method of Reciprocal Averages has an advantage over the principal 
components approach employed by Guttman's solution as it does not 
require the calculation of an item response category co-occurrence 
matrix. In addition, the Method of Reciprocal Averages can be 
implemented as a simple extension to existing item analysis computer 
programs. 

THE RELATION OF THE METHOD OF RECIPROCAL AVERAGES TO GUTTMAN'S 
INTERNAL CONSISTENCY SCALING MODEL 

Over the years a common practice among psychologists has been 
the creation of ad hoc computational procedures yielding various 
scores, indices, loadings, etc., which aid in the understanding of 
data. Such procedures often were developed within the context of 
a particular study to meet some practical need of the researcher. 
In some cases, the general usefulness of the procedure led to the 
derivation of an underlying mathematical rationale, and what was once 
an ad hoc procedure developed into a standard psychometric technique. 
A good example of this process is factor analysis, where Spearman's 
early procedures were developed by later workers into a mathematically 
sophisticated major area of psychometrics. The field of psychological 
scaling is also one in which many ad hoc procedures have been developed, 
and some of these have become established techniques. One of these 
ad hoc scaling procedures, which has been used by researchers for many 
years (Klausmeier, Quilling & Wardrop, 1968; Mitzel & Hoyt, 1954; 
Mosier, 1942) and is currently implemented in a widely distributed 
computer program (Baker, 1960; Baker & Martin, 1969), is the Method 
of Reciprocal Averages. This scaling procedure yields a set of item 
response weights for a psychological inventory that maximizes the 
internal consistency index of the inventory for a group of subjects. 
Despite its use over a period of many years, the procedure remains 
an ad hoc one in that an explicit mathematical model for the Method 
of Reciprocal Averages has not appeared in the literature. However, 
a general mathematical model for internal consistency scaling has been 
provided by Guttman (1941). It is the purpose of the present paper 
to demonstrate that although the Method of Reciprocal Averages 
preceded Guttman's (1941) work, it is actually a particular 
implementation of that model. 

The Method of Reciprocal Averages has its origin in a scaling 
procedure partially described by Richardson and Kuder (1933). The 
procedure in the article was not named, but it became well known to 
psychometricians of the era, as both Horst (1935) and Guttman (1941) 
attributed the Method of Reciprocal Averages to M. Richardson, citing 
the 1933 article. A detailed description of the data processing and 
computational procedures of the Method of Reciprocal Averages was 
not available until it was presented by Mosier (1946). Attempts 
to provide a formal mathematical model for this scaling procedure were 
also spread over a considerable period of years. Guttman (1941) 
provided a general model for internal consistency scaling based upon 
a least squares approach. Mosteller (1949) developed a scaling 
technique in which only the positive response to an item was weighted 
and a set of equations was solved for the item response weights that 
maximized the internal consistency index. The equations solved under 
this approach were essentially the same as those due to Guttman (1941). 
In an unpublished paper, Hoyt and Collier (1953) showed empirically 
that in the single response situation, the Method of Reciprocal Averages 
and Mosteller's technique yielded the same item response weights, 
suggesting, for this case at least, that a connection exists between 
the Method of Reciprocal Averages and Guttman's (1941) least squares 
model. In addition, Guttman (1941) mentioned that he felt his model 
and the Method of Reciprocal Averages were related, but he did not 
pursue the issue. On the surface, it is not obvious that the 
Method of Reciprocal Averages is a particular implementation of 
the solution of the equations yielded by Guttman's least squares 
approach. The former involves rather informal data processing 
procedures whereas the latter is based upon a complex mathematical 
approach. In order to develop the relationship between the two 
fully, the existing bases of both are presented below and then the 
relationship is shown. 

The Method of Reciprocal Averages 

Mosier's (1946) procedures were designed for implementation on 
punched card equipment and are reformulated here to agree with the 
computer program due to Baker and Martin (1969). Fundamentally, the 
situation is one involving a population of N subjects who respond to 
a universe of m items, where each item has two or more possible 
item response categories and each subject selects exactly one item 
response category per item, responding to all items in the 
universe of items. Thus, there will be m items having a total of r 
possible item response categories. The basic assumption is that a 
single variable underlies the items in the universe. The goal, then, 
is to obtain a set of item response category weights $X_j$ 
($j = 1, 2, \ldots, r$) which will maximize the internal consistency 
index of the instrument for the population of subjects. 
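
The data layout just described can be made concrete with a small sketch. The toy responses, the variable names, and the function `indicator_matrix` are illustrative assumptions and are not taken from the Baker and Martin program; the sketch only shows one way to code each subject's choices as 0/1 indicators over the r response categories.

```python
# Hypothetical toy data: N = 4 subjects, m = 2 items.
# Item 0 has 2 response categories and item 1 has 3, so r = 5.
categories_per_item = [2, 3]
responses = [  # responses[i][g] = category chosen by subject i on item g
    [0, 0],
    [0, 1],
    [1, 2],
    [1, 2],
]


def indicator_matrix(responses, categories_per_item):
    """Return E (N x r) with E[i][k] = 1 if subject i chose category k."""
    offsets = []  # position of each item's first category among the r columns
    total = 0
    for count in categories_per_item:
        offsets.append(total)
        total += count
    E = []
    for row in responses:
        indicators = [0] * total
        for g, choice in enumerate(row):
            indicators[offsets[g] + choice] = 1
        E.append(indicators)
    return E


E = indicator_matrix(responses, categories_per_item)
for row in E:
    print(row)
```

Each row of `E` contains exactly m ones, one per item, which reflects the requirement that every subject responds once to every item.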

The Method of Reciprocal Averages is an iterative procedure in 
which an a priori set of item response weights is used to obtain a score 
for each subject, then the scores in conjunction with the subjects' 
item response choices are used to derive a new set of item response 
weights. The derived weights are then used to obtain a new score 
for each individual and the iterative process is continued until 
a convergence criterion is met. The final set of item response 
weights will be those which maximize the internal consistency index 
for the group of subjects on the given instrument. In the following 
paragraphs the procedural steps are presented and a notational scheme 
for representing the variables involved is developed. 



Step A 

The investigator assigns an a priori (though not necessarily 
distinct) weight to each of the r possible item response categories 
in the instrument. These weights are usually integers 
ranging in value from unity to some arbitrary upper limit. Let 
$X_k$ ($k = 1, 2, 3, \ldots, r$) denote an arbitrary set of item 
response weights, the a priori weights at this stage. 



Step B 

The a priori weights are used as a scoring key and a total score 
for each subject is obtained. Let $e_{ik} = 1$ if the i-th subject 
($i = 1, 2, \ldots, N$) chooses the k-th item response category, and 
$e_{ik} = 0$ otherwise. A subject's total score $T_i$ is given by the 
sum of the item weights corresponding to his choices. 

$$T_i = \sum_{k=1}^{r} e_{ik} X_k \tag{1}$$
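
As a minimal sketch of Step B, the total scores in (1) can be computed directly from the indicator values. The matrix `E`, the a priori weights `X`, and the helper name `total_scores` are hypothetical illustrations (four subjects, five response categories), not values from the paper.

```python
# Hypothetical indicator matrix: E[i][k] = 1 if subject i chose category k.
E = [
    [1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
]
X = [1, 2, 1, 2, 3]  # assumed a priori integer weights, one per category


def total_scores(E, X):
    """Equation (1): T_i is the sum of the weights of the chosen categories."""
    return [sum(e * x for e, x in zip(row, X)) for row in E]


T = total_scores(E, X)
print(T)  # [2, 3, 5, 5]
```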



Step C 

The mean score for all subjects choosing a given item response 
is computed for each item response category (the mean item response 
score). Let $e_{ij} = 1$ if the i-th subject responds to the j-th 
($j = 1, 2, 3, \ldots, r$) item response category and 0 otherwise (an 
alternative specification for $e_{ik}$). Let $e_{ij} e_{ik} = 1$ if the 
i-th subject selects both item response categories j and k. The total 
score for a person choosing a specific item response category, say j 
($e_{ij} = 1$), is given by: 

$$T_i = \sum_{k=1}^{r} e_{ij} e_{ik} X_k \tag{2}$$

Note that $e_{ij}$ merely designates the item response category of 
interest, whereas $e_{ik}$ specifies the subject's response choice to 
all items including the one of interest. Now, summing over all 
subjects, the sum of the total scores for all persons choosing 
response category j is 

$$\sum_{i=1}^{N} e_{ij} T_i = \sum_{i=1}^{N} \sum_{k=1}^{r} e_{ij} e_{ik} X_k \tag{3}$$

and the number of subjects choosing item response j is given by 

$$n_j = \sum_{i=1}^{N} e_{ij} \tag{4}$$

The mean item response score is 

$$M_j = \frac{1}{n_j} \sum_{i=1}^{N} \sum_{k=1}^{r} e_{ij} e_{ik} X_k \tag{5}$$
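
Step C can be sketched as follows. The indicator matrix `E`, the total scores `T`, and the helper name `mean_item_scores` are hypothetical; returning `None` for a category that no subject chose is an assumption of this sketch, since the text does not say how such categories are handled.

```python
# Hypothetical indicator matrix (4 subjects, 5 categories) and total scores.
E = [
    [1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
]
T = [2, 3, 5, 5]


def mean_item_scores(E, T):
    """Mean total score of the subjects choosing each response category."""
    r = len(E[0])
    M = []
    for j in range(r):
        n_j = sum(row[j] for row in E)                       # equation (4)
        score_sum = sum(row[j] * t for row, t in zip(E, T))  # equation (3)
        # Assumption: a category nobody chose has no defined mean score.
        M.append(score_sum / n_j if n_j else None)
    return M


M = mean_item_scores(E, T)
print(M)  # [2.5, 5.0, 2.0, 3.0, 5.0]
```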

Step D 

The mean item response scores are now used to assign the derived 
weights $X_j$ to the r item response categories. The frequency 
distribution of the $M_j$ is divided into L equal area intervals and 
an integer weight is assigned to each interval. Then, the interval 
into which a given $M_j$ falls is determined and the derived item 
response weight $X_j$ is the integer corresponding to this interval. 
Thus, the derived weight $X_j$ is proportional to the mean item 
response score based upon the weights from the previous iteration. 
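
One possible reading of Step D is sketched below. It assumes that "equal area intervals" means cutting the ranked $M_j$ into L groups holding roughly equal numbers of categories; the text does not spell out the exact rule (for example, whether each $M_j$ is counted once or once per subject), so the function `integer_weights` and its tie handling are assumptions rather than the procedure used in the Baker and Martin program.

```python
def integer_weights(M, L):
    """Assign each M_j an integer 1..L according to its rank among the M_j.

    Assumption: intervals hold roughly equal numbers of categories, and
    tied values share the interval of the lowest tied rank.
    """
    r = len(M)
    ordered = sorted(M)
    weights = []
    for value in M:
        rank = ordered.index(value)  # how many M_j lie strictly below value
        weights.append(1 + (rank * L) // r)
    return weights


M = [2.5, 5.0, 2.0, 3.0, 4.0, 6.0]  # hypothetical mean item response scores
print(integer_weights(M, L=3))  # [1, 3, 1, 2, 2, 3]
```

Because the assignment depends only on the rank order of the $M_j$, the integer weights keep the ordering of the mean item response scores while discarding their exact spacing.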

Step E 

The criterion used to determine whether the iterative procedure 
should be terminated is the difference between the values of the 
Hoyt ANOVA reliability index (Hoyt, 1941) on two successive 
iterations. If the positive difference is sufficiently small, the 
most recent set of derived weights is considered to be the "optimum" 
set. If not, the $X_k$'s are replaced by the $X_j$'s and steps B-E 
are repeated. 
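
Steps B through E can be combined into one loop, sketched below under stated assumptions. The sketch uses the mean item response scores themselves as derived weights (the variant the text attributes to Richardson and Kuder) in place of the integer assignment of Step D, and it recenters and rescales the weights on every pass; that normalization is an addition made here for numerical stability. Hoyt's index is computed as 1 minus the ratio of the error mean square to the between-persons mean square. The data, function names, and tolerance are hypothetical.

```python
def item_scores(responses, offsets, X):
    """S[i][g] = weight of the category subject i chose on item g."""
    return [[X[offsets[g] + c] for g, c in enumerate(row)] for row in responses]


def hoyt_reliability(S):
    """Hoyt's ANOVA reliability from a persons-by-items score table."""
    N, m = len(S), len(S[0])
    correction = sum(map(sum, S)) ** 2 / (N * m)
    ss_total = sum(s * s for row in S for s in row) - correction
    ss_persons = sum(sum(row) ** 2 for row in S) / m - correction
    ss_items = sum(sum(row[g] for row in S) ** 2 for g in range(m)) / N - correction
    ms_persons = ss_persons / (N - 1)
    ms_error = (ss_total - ss_persons - ss_items) / ((N - 1) * (m - 1))
    return 1.0 - ms_error / ms_persons


def reciprocal_averages(responses, categories_per_item, X, tol=1e-8, max_iter=200):
    offsets = [sum(categories_per_item[:g]) for g in range(len(categories_per_item))]
    r = sum(categories_per_item)
    reliability = hoyt_reliability(item_scores(responses, offsets, X))
    for _ in range(max_iter):
        # Step B: total score per subject under the current weights.
        T = [sum(row) for row in item_scores(responses, offsets, X)]
        # Step C: mean total score of the subjects choosing each category.
        sums, counts = [0.0] * r, [0] * r
        for row, t in zip(responses, T):
            for g, c in enumerate(row):
                sums[offsets[g] + c] += t
                counts[offsets[g] + c] += 1
        M = [s / n if n else 0.0 for s, n in zip(sums, counts)]
        # Stand-in for Step D (assumption): centered, rescaled mean scores.
        center = sum(n * mj for n, mj in zip(counts, M)) / sum(counts)
        spread = max(abs(mj - center) for mj in M) or 1.0
        X = [(mj - center) / spread for mj in M]
        # Step E: stop when the reliability index no longer changes.
        new_reliability = hoyt_reliability(item_scores(responses, offsets, X))
        converged = abs(new_reliability - reliability) < tol
        reliability = new_reliability
        if converged:
            break
    return X, reliability


# Hypothetical data: 8 subjects, 3 items with 3 response categories each.
responses = [[0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 1, 1],
             [1, 1, 2], [1, 2, 1], [2, 2, 2], [2, 2, 1]]
X0 = [1, 2, 3, 1, 2, 3, 1, 2, 3]  # a priori weights
weights, reliability = reciprocal_averages(responses, [3, 3, 3], X0)
print([round(w, 3) for w in weights], round(reliability, 4))
```

The sketch would fail with a division by zero if every subject obtained the same total score, a degenerate case it does not attempt to handle.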

This disarmingly simple procedure results in a set of weights 
with very useful psychometric properties. According to Mosier (1946) 
these are: 

"(1) The reliability of each item and the internal 
consistency of the weighted inventory are maximized. (2) 
The correlation between the item and the total score is 
maximized and the product moment correlation coefficient 
becomes identical with the correlation ratio. (3) The 
relative variance of the distribution of scores (coefficient 
of variation) is maximized. (4) The relative variance 
of item scores within a single case is minimized. (5) 
The correlation between an item and total score is 
proportional to the standard deviation of the item 
weights for that item. (6) Questions which bear no 
relation to the total-score variable are automatically 
weighted so that they exert no effect in the scoring." 

Guttman's Least Squares Model 

Guttman (1941) established a situation identical to that 
described above for the Method of Reciprocal Averages, involving 
a population of N subjects and a universe of m items. He wanted 
to obtain a set of item response weights which maximized the ratio 
of the variance between people to the total variance, i.e., maximized 
a correlation ratio. A symbology for this situation can be developed 
by letting X be a diagonal matrix of item response weights 
($X_1, X_2, X_3, \ldots, X_r$) and letting E be the matrix of $e_{ik}$ 
as defined above. Now the matrix B is given by: 

$$B = XE \tag{7}$$

where the r rows of B correspond to item response categories, and 
the N columns correspond to subjects. Summing across the rows of B 
within a given column, one obtains the sum of the weights, i.e., the 
total score, for a given subject (i). The arithmetic mean of these 
weights is given by 

$$a_i = \frac{1}{m} \sum_{k=1}^{r} e_{ik} X_k,$$

where m (the number of items) is the same for all individuals. Note 
that $m a_i = \sum_{k=1}^{r} e_{ik} X_k$ is merely a subject's total 
score. The grand mean of all the non-zero elements of the matrix B 
is given by: 

$$\bar{a} = \frac{1}{mN} \sum_{i=1}^{N} \sum_{k=1}^{r} e_{ik} X_k = \frac{1}{mN} \sum_{i=1}^{N} m a_i = \frac{1}{N} \sum_{i=1}^{N} a_i \tag{8}$$

The variance between people is 

$$R = \frac{1}{N m^2} \sum_{i=1}^{N} \left( m a_i - \frac{1}{N} \sum_{h=1}^{N} m a_h \right)^2 = \frac{1}{N m^2} \sum_{i=1}^{N} m^2 a_i^2 - \bar{a}^2 \tag{9}$$

The total variance is 

$$W = \frac{1}{Nm} \sum_{i=1}^{N} \sum_{k=1}^{r} e_{ik} (X_k - \bar{a})^2 = \frac{1}{Nm} \sum_{k=1}^{r} n_k X_k^2 - \bar{a}^2 \tag{10}$$

where $\sum_{i=1}^{N} e_{ik} = n_k$. 

The correlation ratio to be maximized is: 

$$\eta^2 = \frac{R}{W} = \frac{m \sum_{i=1}^{N} a_i^2 - mN\bar{a}^2}{\sum_{k=1}^{r} n_k X_k^2 - mN\bar{a}^2} \tag{11}$$

Now because the variances of weights in a given column of B 
are unaffected by a shift in the origin of measurement of the $X_k$, 
the correlation ratio will be invariant to such a shift. A 
simplification can then be achieved by letting 

$$\bar{a} = \frac{1}{Nm} \sum_{k=1}^{r} n_k X_k = 0 \tag{12}$$

Substituting (12) in (11), the correlation ratio to be maximized 
becomes: 

$$\eta^2 = \frac{m \sum_{i=1}^{N} a_i^2}{\sum_{k=1}^{r} n_k X_k^2} \tag{13}$$

The reader is referred to Guttman (1941) and Torgerson (1958, p. 338) 
for mathematical details of this maximization process. The net result 
is a system of linear equations which are solved iteratively for the 
derived item response weights. Guttman (1941) recognized that these 
equations were identical to those solved by Hotelling (1933) for 
principal components. Thus, he succeeded in showing that internal 
consistency scaling is merely a special case of the more general 
principal components model. Intuitively, one would anticipate that the 
single variable underlying the items would be reflected in the first 
principal component, but Guttman (1941) shows that the first principal 
component yields an extraneous solution consisting of a vector of 
weights all equal to unity, which maximizes the correlation ratio. 
Thus, the desired weights correspond to the second principal component. 
It appears to the present author that this artifact results from having 
set $\bar{a} = 0$ prior to maximizing the correlation ratio. In order to 
obtain a nonextraneous set of weights, a constraint must be introduced 
that forces the derived weights to be orthogonal to the extraneous 
weights. Guttman (1941) used the constraint that the sum of the 
$n_k X_k$ across the response categories of each item must be equal to 
zero. Mosteller (1949) used the less stringent constraint that the sum 
of the $n_k X_k$ across all response categories must be equal to zero, 
which is equivalent to requiring the mean score over all subjects to 
be zero. Torgerson (1958) used this latter constraint when presenting 
the derivation of the Guttman model. 
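
The correlation ratio and the extraneous solution can be illustrated numerically. The sketch below assumes the ratio is the between-people sum of squares divided by the total sum of squares of the chosen weights, and it evaluates the simplified form obtained by taking the grand mean as zero without first centering the weights. The indicator matrix `E`, the weights, and both function names are hypothetical.

```python
def correlation_ratio(E, X):
    """Between-people variation of the weights over their total variation."""
    N, r = len(E), len(E[0])
    m = sum(E[0])  # number of items, i.e. choices made by each subject
    a = [sum(e * x for e, x in zip(row, X)) / m for row in E]
    a_bar = sum(a) / N
    n = [sum(row[k] for row in E) for k in range(r)]
    between = m * sum((ai - a_bar) ** 2 for ai in a)
    total = sum(nk * (xk - a_bar) ** 2 for nk, xk in zip(n, X))
    return between / total


def uncentered_ratio(E, X):
    """Same ratio with the grand mean taken as zero and no centering done."""
    r = len(E[0])
    m = sum(E[0])
    a = [sum(e * x for e, x in zip(row, X)) / m for row in E]
    n = [sum(row[k] for row in E) for k in range(r)]
    return m * sum(ai ** 2 for ai in a) / sum(nk * xk ** 2 for nk, xk in zip(n, X))


E = [
    [1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
]
X = [1, 2, 1, 2, 3]
eta_sq = correlation_ratio(E, X)
shifted = correlation_ratio(E, [x + 10 for x in X])
extraneous = uncentered_ratio(E, [1, 1, 1, 1, 1])
print(eta_sq, shifted, extraneous)
```

For these toy data the centered ratio is 9/13 and does not change when 10 is added to every weight, while weights that are all equal to unity drive the uncentered ratio to 1. This is why a constraint is needed to exclude that solution.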

The linear equations (Torgerson, 1958) resulting from the 
maximization of the correlation ratio are: 

$$\begin{aligned}
n_{11} X_1 + n_{12} X_2 + \cdots + n_{1r} X_r &= m \eta^2 n_{11} X_1 \\
n_{21} X_1 + n_{22} X_2 + \cdots + n_{2r} X_r &= m \eta^2 n_{22} X_2 \\
&\;\;\vdots \\
n_{r1} X_1 + n_{r2} X_2 + \cdots + n_{rr} X_r &= m \eta^2 n_{rr} X_r
\end{aligned} \tag{14}$$

subject to the constraint 

$$n_{11} X_1 + n_{22} X_2 + \cdots + n_{rr} X_r = 0.$$

Note that the weights on the left correspond to $X_k$ and on the right 
to $X_j$. In a more compact notation these equations are: 

$$\sum_{k=1}^{r} n_{jk} X_k = m \eta^2 n_{jj} X_j \quad \text{for each } j, \tag{15}$$

subject to the constraint 

$$\sum_{k=1}^{r} n_{kk} X_k = 0. \tag{16}$$



These equations are solved iteratively, and one must first create 
the co-occurrence matrix for the item response categories, an r x r 
matrix whose cells are the $n_{jk}$. Then a set of r-1 a priori weights 
($X_1, X_2, \ldots, X_{r-1}$) is selected and the constraint (16) is 
solved for the value of $X_r$. These weights are then substituted for 
the $X_k$ on the left side of (15) and each row can be solved for the 
value of $X_j$. The term $m\eta^2$ appears in each row but can be 
ignored, as it is a constant of proportionality which is the same for 
all $X_j$. The $X_j$'s are then rescaled so that they can be compared 
with the previous set of weights. If they are the same, the iterative 
process is terminated. If not, the derived weights become the $X_k$'s 
and the process is repeated. 
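
The iterative solution of (15) under constraint (16) can be sketched as follows. The toy data, the function names, and the tolerance are assumptions. The step that re-imposes the constraint on every pass is an addition made here so that rounding error cannot drift toward the extraneous all-equal solution, and the rescaling rule (largest absolute weight set to 1) is one arbitrary choice among many.

```python
def cooccurrence(E):
    """n[j][k] = number of subjects choosing both category j and category k."""
    r = len(E[0])
    return [[sum(row[j] * row[k] for row in E) for k in range(r)] for j in range(r)]


def rescale(X, n):
    """Re-impose constraint (16), then scale so the largest |X_j| is 1."""
    r = len(X)
    shift = sum(n[k][k] * X[k] for k in range(r)) / sum(n[k][k] for k in range(r))
    centered = [x - shift for x in X]
    biggest = max(abs(x) for x in centered)
    return [x / biggest for x in centered]


def solve_weights(E, first_weights, tol=1e-10, max_iter=500):
    n = cooccurrence(E)
    r = len(n)
    # Solve constraint (16) for the last weight X_r.
    last = -sum(n[k][k] * first_weights[k] for k in range(r - 1)) / n[r - 1][r - 1]
    X = rescale(list(first_weights) + [last], n)
    for _ in range(max_iter):
        # Each row of (15), ignoring the common constant m * eta^2.
        derived = [sum(n[j][k] * X[k] for k in range(r)) / n[j][j] for j in range(r)]
        derived = rescale(derived, n)
        if max(abs(d - x) for d, x in zip(derived, X)) < tol:
            return derived
        X = derived
    return X


# Hypothetical data: 8 subjects, 3 items with 3 response categories each.
responses = [[0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 1, 1],
             [1, 1, 2], [1, 2, 1], [2, 2, 2], [2, 2, 1]]
offsets = [0, 3, 6]
E = [[1 if k in {offsets[g] + c for g, c in enumerate(row)} else 0
      for k in range(9)] for row in responses]
weights = solve_weights(E, [1, 2, 3, 1, 2, 3, 1, 2])
print([round(w, 3) for w in weights])
```

The sketch assumes every response category was chosen by at least one subject, since each row is divided by $n_{jj}$.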

Guttman (1941) showed that the identical results are obtained if 
one starts out to obtain scores which maximize the ratio of the 
variance between categories to the total variance. He also showed 
that if the correlation between the scores and the weights is maximized, 
the same results are obtained and that the square of the correlation 
ratio is equal to the square of the product-moment correlation. Thus, 
the numerous properties attributed to the Method of Reciprocal Averages 
by Mosier (1946) stem from these relationships. 



Relating the Method of Reciprocal Averages to the Guttman Model 

Inspection of the equations (14) to be solved for the weights 
readily reveals that the sum of the $n_{jk} X_k$ terms on the left hand 
side in a given equation is equal to the sum of the total scores of all 
persons choosing item response j. On the right hand side, the $n_{jj}$ 
term is the number of subjects choosing item response j. The Guttman 
equations (14) can be rewritten as: 

$$X_j = \frac{1}{m \eta^2} \cdot \frac{\sum_{k=1}^{r} n_{jk} X_k}{n_{jj}} \tag{17}$$

Now, because the numerator of this equation and that of equation (5) 
are identical, it is clear that the derived weights $X_j$ are 
proportional to the mean item response scores, a point made by Guttman. 
Under the Method of Reciprocal Averages the derived weights were also 
proportional to the mean item response scores. The difference between 
the solutions is the nature of this proportionality. In the former, 
$m\eta^2$ is the constant of proportionality, which is the same for all 
weights. In the latter, the integer weight represents a point on the 
frequency distribution of the mean item response scores and hence is 
proportional to them. In Richardson and Kuder (1933) the mean item 
response scores were used as the derived weights. The use of integer 
weights was introduced by Mosier (1946) in order to simplify the 
computational procedures when accounting machines were used. In 
practice this substitution appears to have negligible effect upon the 
maximization of the internal consistency index. 

Under the Method of Reciprocal Averages, the investigator is 
free to choose a priori weights and need not be concerned directly 
with meeting the constraint imposed upon the equations by Guttman 
(1941). In that the constraint does not appear explicitly in the 
Method of Reciprocal Averages, it must be met implicitly. The 
constraint (16) can be expressed in terms of the $M_j$ (writing $n_k$ 
for $n_{kk}$) as follows: 

$$\sum_{k=1}^{r} n_k X_k = \sum_{j=1}^{r} n_j M_j = 0 \tag{18}$$

Now at the end of each iteration 

$$\sum_{j=1}^{r} n_j \sum_{i=1}^{N} \frac{e_{ij} T_i}{n_j} = m \sum_{i=1}^{N} T_i = 0. \tag{19}$$

Substituting from (1) for $T_i$, equation (19) becomes 

$$\sum_{i=1}^{N} \sum_{k=1}^{r} e_{ik} X_k = \sum_{k=1}^{r} \sum_{i=1}^{N} e_{ik} X_k = 0, \quad \text{but} \quad \sum_{i=1}^{N} e_{ik} = n_k,$$

and one obtains 

$$\sum_{k=1}^{r} n_k X_k = 0 \tag{20}$$

which is identically the constraint given by Mosteller (1949) and 
Torgerson (1958). Thus, using the mean item response scores as the 
item response weights will meet the constraint and the weights 
corresponding to the second principal component will be obtained. It 
should be noted that the constraint is not immediately satisfied under 
the Method of Reciprocal Averages, as the initial a priori weights do 
not meet the constraint; however, the derived weights will meet the 
constraint. 

From the above it can be seen that the Method of Reciprocal 
Averages actually implements a solution of Guttman's equations. The 
technique employs direct computation of the mean item response scores 
rather than obtaining them as the solution to a set of equations. Yet 
the derived weights satisfy the constraint that ensures the derived 
weights and the extraneous set of weights are orthogonal. 
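
The two counting identities that carry this argument can be checked numerically. The sketch below uses a hypothetical indicator matrix and weights. It confirms that the left side of each Guttman equation equals the sum of the total scores of the subjects choosing that category, and that the $n_j$-weighted sum of the mean item response scores equals m times the sum of the total scores.

```python
# Hypothetical data: 4 subjects, 2 items, 5 response categories in all.
E = [
    [1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
]
X = [1, 2, 1, 2, 3]
N, r = len(E), len(E[0])
m = sum(E[0])  # number of items

# Total scores, equation (1).
T = [sum(e * x for e, x in zip(row, X)) for row in E]

# Co-occurrence counts n[j][k]; n[j][j] is the number choosing category j.
n = [[sum(row[j] * row[k] for row in E) for k in range(r)] for j in range(r)]

# Left side of each Guttman equation: sum over k of n_jk * X_k.
guttman_lhs = [sum(n[j][k] * X[k] for k in range(r)) for j in range(r)]

# Sum of the total scores of the subjects choosing category j.
score_sums = [sum(row[j] * t for row, t in zip(E, T)) for j in range(r)]

# Mean item response scores and their n_j-weighted sum.
M = [score_sums[j] / n[j][j] for j in range(r)]
weighted_sum = sum(n[j][j] * M[j] for j in range(r))

print(guttman_lhs, score_sums)    # both [5, 10, 2, 3, 10]
print(weighted_sum, m * sum(T))   # 30.0 and 30
```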

Computational Considerations 

Although the mathematical model due to Guttman (1941) shows that 
internal consistency scaling can be performed using the computational 
procedures of principal components analysis, there are certain 
computational disadvantages even when digital computers are employed. 
The principal components approach requires calculation of the 
co-occurrence matrix for the item response categories, and the size of 
this matrix depends on the total number of item response categories. 
Because most measuring instruments contain many item response 
categories, a large co-occurrence matrix results. Methods are available 
for handling large matrices of this type, but they are expensive in 
terms of computer memory and execution time. Thus, due to storage 
requirements, Guttman's procedure is limited to a modest number of 
item response categories even when computers are employed. In 
contrast, the computational procedures in the Method of Reciprocal 
Averages do not require a co-occurrence matrix, and the size of the 
instrument to be analyzed is limited only by the length of the vector 
of item response weights. In the current computer program (Baker and 
Martin, 1969) the length of this vector is set arbitrarily at 1800. 
The amount of computer time used is a function primarily of the number 
of subjects rather than of the size of the instrument. 
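
The storage argument can be made concrete with simple arithmetic. The sketch below takes r = 1800 from the vector length quoted above and counts cells only; the function name and the decision to also report the symmetric half of the matrix are assumptions of the sketch, and no particular word size or machine is implied.

```python
def storage_cells(r):
    """Cells needed for r response categories under each approach."""
    return {
        "full_cooccurrence_matrix": r * r,
        "symmetric_half_matrix": r * (r + 1) // 2,
        "weight_vector": r,
    }


cells = storage_cells(1800)
print(cells)
# {'full_cooccurrence_matrix': 3240000,
#  'symmetric_half_matrix': 1620900,
#  'weight_vector': 1800}
```

Even when only the symmetric half of the co-occurrence matrix is kept, the matrix needs roughly 900 times as many cells as the weight vector for an instrument of this size.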

A significant feature of the Method of Reciprocal Averages as 
a scaling technique is that it can be appended to an existing 
item analysis program, as was done by Baker and Martin (1969). 
The mean item response score is part of the item-criterion correlation 
calculation for both the biserial and point biserial correlations 
commonly used as item discrimination indices. The calculation of an 
index of internal consistency is routinely part of an item analysis 
program. Therefore, to implement the Method of Reciprocal Averages 
one merely adds a simple weight assignment subroutine and a 
convergence test to an item analysis program. 

In present implementations of the Method of Reciprocal Averages, 
the mean item response scores are not used as the derived weights; 
rather, the frequency distribution of the mean item response scores 
is divided into a number of regions and an integer associated 
with each region serves as the weight in place of the obtained mean 
item response score. The effect of this substitution appears to be 
minor, and its simplicity from a computer programming point of view 
outweighs other considerations. 

The convergence criterion for the iterative procedures used by 
Guttman (1941), Mosier (1946) and Mosteller (1949) was that two 
successive sets of weights did not differ. The weight by weight 
comparison can be avoided if one recalls that maximizing internal 
consistency is the goal of the scaling procedure. Thus, Baker and 
Martin (1969) used the difference between two successive values of 
Hoyt's ANOVA index of internal consistency as the convergence 
variable. When this difference is less than some arbitrarily small 
value, the iterative procedure is terminated. Such a convergence 
criterion is more in keeping with the basic rationale of the scaling 
procedure. 

Summary 

The Method of Reciprocal Averages first appeared as an informal 
computational procedure in the work of Richardson and Kuder (1933) and 
has been used by psychologists and others since that time. But, since 
its inception, a deficiency of the technique has been the lack of an 
explicit mathematical model. Guttman (1941) had derived a model for 
internal consistency scaling that was equivalent to the principal 
components model due to Hotelling (1933) . In that the Method of 
Reciprocal Averages also maximizes the internal consistency index of 
an instrument for a given population, a connection between the two 
should exist. The present paper used the work of Hoyt and Collier 
(1953), Mosier (1946), Mosteller (1949), and Torgerson (1958) to show 
that the Method of Reciprocal Averages actually solves the equations 
yielded by Guttman's Least Squares approach under the appropriate 
constraint. 

The Method of Reciprocal Averages has considerable appeal as a 
scaling technique on computational grounds because it does not require 
the computation of an item response co-occurrence matrix. Hence it 
can cope with very large instruments at a reasonable cost as the 
computer time used is a function of the number of subjects rather than 
the size of the instrument. In addition, the computational procedures 
are such that they can be implemented as simple extensions to existing 
item analysis programs. 

In practice, the Method of Reciprocal Averages has proved to be 
an extremely useful scaling procedure. Clarification of its underlying 
mathematical model means that it can be employed with confidence in 
a wide range of psychological, educational and behavioral science research. 